FOREWORD

This report summarizes the development work of Advanced Instrumentation
Fault Diagnostics for Regenerative Environmental Control and
Life Support Systems conducted by Life Systems, Inc., during the
period of October, 1978, to October, 1979, under NASA Contract
NAS2-10050. The Program Manager was Dr. P. Y. Yang. Technical
Support was provided by Dr. K. C. You, Dr. R. A. Wynveen, J. D.
Powell, III, J. R. Gyorki, F. H. Schubert and J. D. Powell, Jr.
Administrative and documentation support was provided by R. H.
Kohler, B. A. Ginunas, D. A. Jones, B. M. Jaras and M. Prokopcak.

The authors wish to acknowledge the technical contributions, 
support and program guidance offered by the program Technical 
Monitor, P. D. Quattrone, Chief, Advanced Life Support Office,

SUMMARY

Development of regenerative environmental control and life support systems
requires instrumentation characteristics which evolve from one development 
phase to another. As the development phase moves toward flight hardware, the 
system availability becomes an important design aspect which requires high 
reliability and maintainability. As a part of the continuous development 
effort, a program to evaluate, design and demonstrate advanced instrumentation 
fault diagnostics was successfully completed at Life Systems, Inc. Fault 
tolerance designs for reliability and other instrumentation capabilities to 
increase maintainability were evaluated and studied. 

The major accomplishments of the development program were:

• Fault tolerance design 

• Built-in diagnostic design 

• Actuator fault prediction evaluation and analysis 

• Spare sensor calibration curve retention evaluation 

• Automatic in situ sensor calibration study 

• Data storing for fault isolation study 

• Demonstration of intercomputer links with error checking capability 

• Demonstration of microprocessor-based built-in diagnostic circuit 

One of the most important benefits of the program is the elimination of the 
"instrumentation-lagging" syndrome -- the instrumentation is reaching its 
maturity at the same rate as the regenerative process development for the 
spacecraft atmosphere revitalization. The program also established common- 
ality of instrumentation for environmental control and life support subsystems; 
in addition, it reduced the instrumentation development effort and the risks 
for individual subsystems as well as the integrated systems. The program
provided a focal point for the instrumentation development effort and resulted 
in an extremely well coordinated program to ensure the instrumentation readi- 
ness for long-term manned space missions. 

It was concluded from the studies and demonstrations conducted under this 
program that built-in circuits to test the integrity of electronic components 
are required to ensure system safety and ultimately to provide fault tolerance 
for higher reliability. It was also concluded that maintenance aids for 
better maintainability are required. Intercomputer communication with error 
checking and software recovery algorithms will enhance the instrumentation 
reliability. Redundancy at the component level is recommended because it 
provides higher reliability than that provided by redundancy at the system or 
assembly level. Further investigation of actuator signature analysis using 
audio signatures instead of vibration is strongly recommended. Spare sensor 
calibration curves should be retained at the sensor head as a part of the 
sensor using advanced electronic technology to minimize size, power consump- 
tion and maintenance effort. Automatic in situ sensor calibration is readily 
applicable to current, voltage, speed and combustible gas sensors. Use of a 
state transition approach for real-time diagnosis is recommended because of 
the smaller memory size requirement compared to the approach of recording and 
storing data for fault isolation diagnosis after a malfunction. 

INTRODUCTION

Instrumentation is required to control and monitor Environmental Control/Life 
Support Systems (EC/LSS). The requirement for EC/LSS instrumentation begins
with the essential functions to control the chemical, physical and electrochemical
processes such as carbon dioxide (CO2) removal, water electrolysis,
CO2 reduction and nitrogen (N2) generation. As more sophisticated functions
are introduced, safety functions to protect personnel and ensure equipment 
integrity are needed. Finally, the requirement for high reliability and 
maintainability and for low weight, size and power consumption will be imposed 
upon the designer by long duration flight missions. 

Background 

Regenerative EC/LSS processes have been under development for many years.(1-23)
The EC/LSS consists of two major systems: the regenerative Air Revitalization
System (ARS) and the Waste Water Management System (WWMS). Life Systems,
Inc. (LSI) has been involved in the design, development and testing of ARS
subsystems to remove excess moisture from the air, concentrate CO2 from the
air, reduce CO2 to water and methane or carbon, generate oxygen (O2) from
water, resupply N2 and provide N2 and hydrogen (H2) separation. In addition, 
LSI has also developed a separate Electrochemical Air Revitalization System 
(EARS) and a water reclamation subsystem using Vapor Compression Distillation 
(VCD) technology. 

Because the applications of the National Aeronautics and Space Administration
(NASA) require low launch weight and long operating life based on in-flight
maintenance, it is essential that the Control and Monitor Instrumentation
(C/M I) development keep pace with the electrochemical and mechanical process
development. This is especially important because the actual flight of the
EC/LSS hardware is several years in the future. Because the technology asso- 
ciated with components of the electronic engineering field is expanding rapidly, 
the advancements projected for the electronics industry must be taken into 
consideration now when designing the C/M I for the advanced EC/LSS processes. 

Previous Efforts 


In general, instrumentation development efforts include the following eight
areas:

1. Integration of subsystems into a complete system. 

2. Development of instrumentation interior architecture including 
processor, logic, memory, input/output (I/O), signal conditioner, 
power conditioner, analog/digital (A/D) interface and power supply. 

3. Development of Test Support Accessories (TSA) instrumentation and 
interfaces.


(1-23) References cited at the end of this report.

4. Development of operator/system interface.

5. Development of system maintenance aids.

6. Incorporation of advanced instrumentation concepts. 

7. Incorporation of the developer's knowledge of operation. 

8. Development of instrumentation packaging. 

Figure 1 shows the relationship among these eight developmental areas. As
shown in the figure, TSA development (not an instrumentation development area)
is related to Areas 1 and 3. Subsystem component and performance evaluation
is related to Area 7. Areas 1 through 5 have been addressed previously.
Areas 4 and 5, development of operator/system interface and system maintenance
aids, received special attention under Contract NAS2-9253, "EC/LSS Maintenance
Instrumentation." The present Contract, NAS2-10050, addressed Area 6:
incorporation of advanced instrumentation concepts.

Research and Development Type Control/Monitor Instrumentation

The EC/LSS instrumentation advanced through a series of subsystem development
programs, including CO2 removal, CO2 reduction, O2 generation and N2 generation.
The instrumentation trend is depicted in Figure 2.

In the past, the C/M I was typically designed with hard-wired logic circuits 
with fault detection, built-in checkout and limited fault avoidance/prediction 
capability. Instrumentation adjustments were made through potentiometers and 
switches. Displays typically used multiple level indicators and panel meters. 

The present generation of C/M I is designed around a minicomputer. Flexibility 
and operator/system interface are emphasized because it is primarily designed
for the development and testing of EC/LSS process hardware under a laboratory 
research and development (R&D) environment. The present R&D type C/M I fea- 
tures cathode-ray tube (CRT) message display, advanced operator command key- 
board, fault avoidance, fault prediction, fault detection, R&D flexibility and 
interface to a hard-copier/Data Acquisition and Reduction System (DARS).

Based on this technology, a series of minicomputer-based, dedicated instrumen- 
tation hardware was developed for controlling and monitoring a variety of 
experimental EC/LSS hardware including an experimental, integrated one-person 
Air Revitalization System (ARX-1) and four subsystems for the three-person 
Regenerative Life Support Evaluation (RLSE) program. The RLSE subsystems 
involved were:

a. Electrochemical Depolarized CO2 Concentrating Subsystem (CS-3)

b. Independent Air Revitalization Subsystem (IARS)

c. Vapor Compression Distillation Subsystem (VCDS)

d. Sabatier CO2 Reduction Subsystem (S-CRS)

Figure 3 shows the ARX-1 C/M I and Figure 4 shows the four RLSE subsystem 
C/M I's. For communication and documentation purposes, these C/M I enclosures 
were designated Model 100 laboratory R&D C/M I. 

FIGURE 2 TREND OF EC/LSS INSTRUMENTATION

Program Objectives

The overall EC/LSS instrumentation development program objectives are to reduce 
size and to increase system availability (reliability and maintainability). The 
goal is to ready the C/M I for the space flight mission several years from now. 
Figure 5 shows the two dimensions of the advanced C/M I R&D thrust. One is the 
development thrust toward the flight hardware C/M I. It is projected that this 
development will go through three generations with the present Model 100 being 
the first for laboratory breadboard EC/LSS hardware development and testing. 

The next generation, Model 200, will be dedicated to prototype hardware. 

Finally, Model 300 will be used for flight hardware applications. The other
dimension of the R&D effort is the engineering thrust within each of the three 
C/M I generations to improve quality, eliminate weak links and increase capa- 
bility but not change the instrumentation architecture. The space thrust is 
the major driving force to push development toward increasing capability per 
unit size, incorporating new components and concepts and increasing the avail- 
ability per unit size. 

Design Guidelines

The design guidelines established by NASA included:

1. Employ design commonality for lower development cost and lower user 
cost. 

2. Emphasize flexibility and development capability during the labora- 
tory breadboard stages while allowing and requiring minimum effort 
to redesign for dedicated flight hardware. 

3. Provide instrumentation hardware and techniques for users that do
not have electronics or computer engineering background.

4. Allow expandability and compatibility for continuous upgrading as 
electronic technology advances. 


Specific Tasks and Objectives 


The specific tasks of the program were to evaluate and/or design advanced con- 
cepts in the fault diagnostics area. The concepts and design were those felt 
necessary for the advancement of instrumentation toward application of EC/LSS 
flight hardware with high availability and safety. These tasks and their objec- 
tives were to: 

1. Study the incorporation of fault tolerance concepts which would
enable the C/M I to detect and bypass faults within the instrumentation
itself. A combination of techniques such as data/information
transmission error checking and instrumentation redundancy were
evaluated.

2. Evaluate the use of microprocessors to provide for checkout and 
diagnostic circuits to verify the integrity of the instrumentation 
and allow maintenance at the line replaceable component (LRC) or 
line replaceable unit (LRU) level. 

3. Investigate the concept of using initial actuator signatures for
periodic comparisons with the real-time actuator signatures as part 
of the fault prediction/isolation concept advancement. 

4. Outline methods for and hardware size impacts of providing retention 
of calibration curves in the computer memory for spare sensors to be 
used following the isolation and replacement of a faulty sensor. 

5. Specify the advantages of automatic in situ sensor calibration and 
recommend types of sensors applicable within an ARS. 

6. Evaluate the need and techniques for recording operating parameters 
and conditions for each out-of-tolerance event occurring within the 
ARS hardware. Focus was on the time preceding and following an
out-of-tolerance event, allowing the data collected to be used for
subsequent diagnostics. 

7. Determine the most promising advanced C/M I approaches and demon- 
strate the capability. 

FAULT DIAGNOSTICS DESIGN 

Instrumentation for terrestrial, industrial processes typically limits its
functions to measuring and controlling variables of the process or system as
accurately as necessary. These two functions, measurement and control,
and the capability to interface with the operator or other instrument systems 
constitute the minimum and essential functions the instrumentation must perform. 

In addition to the minimum functions, the C/M I must also provide monitor 
functions to ensure safety in case of any malfunctions. The safety aspects of 
the instrumentation include fault detection and fail-safe operation. More 
recently, fault isolation and sometimes fault correction instructions for the
operator are included in industrial instrumentation because of the increasing
demand for ease of maintenance.

Additions to the essential and safety functions of C/M I are typically designed
to increase the overall process reliability or maintainability to the level
established by the applications specifications. A high reliability
means long duration operation without failures. A high maintainability means
that if a failure does occur, it can be corrected in a short period of time.

The combination of high reliability and high maintainability means high process
availability. Instrumentation fault diagnostic functions can be incorporated
to increase the process availability. In EC/LSS instrumentation
design, the incorporation of fault diagnostic capabilities into the process 
hardware means adding more sensors, mechanical components and instrumentation 
to the flight hardware. An analysis, therefore, is needed to establish the 
fault diagnostic capabilities needed to increase process availability, the 
fault diagnostic design techniques available for EC/LSS application and the 
weight, volume and power consumption penalties associated with each additional 
capability. 

Scope of Fault Diagnostics

Fault diagnostics typically include fault detection, fault isolation and fault 
correction instructions. The objectives of the fault diagnostics are to 
protect the system and personnel and to increase the system availability; 
therefore, a broader and preferred definition of fault diagnostics is "any
functions designed to avoid, predict, detect, isolate or correct a component
failure." Before a failure actually occurs the instrumentation should be
designed to avoid as many faults as possible and to predict a failure when it
has become unavoidable. The sequence of fault diagnostics thus begins with
fault avoidance followed by fault prediction as shown in Table 1. When a
failure has occurred, the fault detection function next in the sequence should
convey to the operator that a failure has happened and then automatically
trigger the maintenance aid functions: fault isolation and fault correction
instructions. The ultimate goal of the instrumentation is to tolerate fail- 
ures to a certain extent and maintain the system operation in spite of failures. 
The fault tolerance function is sometimes referred to as self-healing or 
self-correcting. A partial fault tolerance is called fail-soft or fail 
gracefully. 


Reliability, Maintainability and Availability 

To an EC/LSS design engineer, reliability is defined as "the probability that 
at a specified time, the EC/LSS is performing the functions as designed under 
specified conditions for an interval of duration." Note that reliability
is measured in terms of probability, expressed in meaningful quantitative
terms and evaluated through applicable statistical methods. Being a probability,
reliability is often stated as a positive number less than one, for
example, 0.95. One of the classical reliability measures is mean-time-between-failures
(MTBF). Time is an essential part of the reliability definition.

For EC/LSS the term reliability often means the mission reliability which is
the probability that the EC/LSS will operate without malfunctions for the 
duration of a mission. 

For EC/LSS users, a more important system quality measurement is availability.
Availability is the probability that the system is operating satisfactorily at 
any point in time when used under stated conditions, where the total time con- 
sidered includes operating time, active repair time, administrative time and 
logistic time. The mission availability (also called interval availability) 
is the expected fraction of a mission duration that the system will operate 
within the tolerances. 

Availability is derived from reliability and maintainability. Maintain- 
ability is the probability that when maintenance action is performed under 
stated conditions a failed system will be restored to operable conditions 
within a specified total downtime. A classic measure of maintainability is 
the mean-time-to-repair (MTTR).
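
The relationship among these three measures can be illustrated numerically. The sketch below is not taken from the report: it assumes a constant failure rate (so that mission reliability is exp(-t/MTBF)) and uses the common steady-state approximation availability = MTBF/(MTBF + MTTR), with MTTR standing in for the total downtime per failure. The function names and the sample figures are hypothetical.

```python
import math


def mission_reliability(mtbf_hours, mission_hours):
    """Probability of no failure during the mission.

    Assumption: constant failure rate (exponential failure model).
    """
    return math.exp(-mission_hours / mtbf_hours)


def availability(mtbf_hours, mttr_hours):
    """Steady-state fraction of time the system is operable.

    Assumption: mttr_hours covers all downtime per failure
    (repair, administrative and logistic time).
    """
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    print(mission_reliability(mtbf_hours=5000.0, mission_hours=2160.0))
    print(availability(mtbf_hours=5000.0, mttr_hours=4.0))
```

Under these assumptions a short repair time can hold availability high even when the mission reliability is modest, which is why maintainability is treated alongside reliability.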


Specific Tasks 

The specific fault diagnostic areas addressed under this program were: 

• Study fault tolerance concepts such as using transmission error
checking and instrumentation redundancy 

• Evaluate microprocessor-based self diagnostic circuit 

• Investigate actuator fault prediction concept using signature 
analysis techniques 

• Study spare sensor calibration curve retention 

• Study automatic in situ sensor calibration 

• Evaluate recording and storing of data for fault isolation 

• Select and demonstrate the most promising one or two of the above 
approaches 

The relationship between these tasks and the primary fault diagnostic levels 
is shown in Table 2. The potential impact of these functions on the instrumentation
reliability, maintainability and availability is depicted in Table 3.

Fault Tolerance Concepts 


Fault tolerance is the highest level fault diagnostic design which requires 
fault detection, isolation and self-repairing, adjustment or reconfiguration. 

Transmission Error Checking. Transmission errors can be checked using information
coding techniques. The following error checking techniques were evaluated
for detecting transmission failures: 

• Generate and compare a checkword of a number of data words 

• Generate and compare odd or even parity of a word 

The checkword can be generated by the summation of a block of data. In this 
case, the checkword is commonly referred to as the checksum of the data block. 
The checkword can also be generated by using other coding schemes such as 
cyclic redundancy code (CRC). In this case, the checkword is the coded dependent
variable using a known equation with a block of data as the input independent
variables to the equation.


Parity checking is an established technique for digital data transmission
error checking. In this technique, a parity bit is generated for each word 
transmitted to form an odd or even parity. 
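
The two techniques can be sketched as follows. This is an illustrative example only, assuming 16-bit data words and a checksum formed by modular addition; the word size and the function names are assumptions, not details of the demonstrated hardware.

```python
def checksum(words, word_bits=16):
    """Checkword formed by summing a block of data words, truncated to one word."""
    mask = (1 << word_bits) - 1
    total = 0
    for word in words:
        total = (total + word) & mask
    return total


def parity_bit(word, even=True):
    """Parity bit that makes the total number of 1 bits even (or odd)."""
    ones = bin(word).count("1")
    bit = ones % 2  # 1 if the word alone has an odd number of ones
    return bit if even else bit ^ 1


def block_is_intact(words, received_checkword, word_bits=16):
    """Receiver side: regenerate the checkword and compare it with the one received."""
    return checksum(words, word_bits) == received_checkword
```

A single parity bit detects any odd number of bit errors within one word, while the checksum covers a whole block at the cost of one extra word per block.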


Transmission error checking alone is a fault detection function. Fault tol- 
erance capability, however, can be achieved by issuing repetitive retrials 
after an error is detected. Retry of transmission is a form of software 
redundancy, and statistics have shown that 96% of peripheral-storage-to-memory
transmission errors can be corrected after three retries.
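
A retry scheme of this kind might look like the sketch below. It assumes the sender appends a checksum word to each block and that the link is modeled by a hypothetical `channel` function that returns the words as received; neither the framing nor the names come from the report.

```python
def checksum(words, word_bits=16):
    """Sum of the data words, truncated to one word."""
    return sum(words) & ((1 << word_bits) - 1)


def transfer_with_retry(channel, block, max_retries=3):
    """Send a block with its checkword; retransmit when the receiver's check fails.

    Returns (data, attempts_used). Raises IOError when the first attempt and
    all retries fail, at which point the fault can only be reported.
    """
    frame = list(block) + [checksum(block)]
    for attempt in range(1, max_retries + 2):  # first try plus max_retries retries
        received = channel(frame)
        data, checkword = received[:-1], received[-1]
        if checksum(data) == checkword:
            return data, attempt
    raise IOError("transmission failed after %d retries" % max_retries)
```

The error check by itself only detects the fault; it is the retransmission loop around it that gives a limited form of tolerance to transient errors.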


For EC/LSS applications, data and information are frequently transmitted back
and forth between subsystems themselves as well as from subsystems to the
central instrumentation. It is recommended that transmission error checking
using the checksum technique be incorporated as a minimum. This capability was
demonstrated as a part of this program effort and is discussed later in this
report.

Use of Redundancies. A basic approach to fault tolerance is to use redundancies.
As shown in Figure 6, dual redundancy provides fault detection but not tolerance. 

TABLE 2 SPECIFIC TASKS VERSUS PRIMARY FAULT DIAGNOSTIC LEVELS

Tasks (rows): Fault Tolerance; Built-in Diagnostic; Actuator Signature
Analysis; Calibration Curve Retention; In Situ Calibration; Data Storing
for Fault Isolation

Fault Diagnostic Levels(a) (columns): FA, FP, FD, FI, FCI, FT

(a) FA = Fault Avoidance, FP = Fault Prediction, FD = Fault Detection,
FI = Fault Isolation, FCI = Fault Correction Instructions and
FT = Fault Tolerance

TABLE 3 FUNCTIONS STUDIED AND THEIR PRIMARY RELATIONS
TO SAFETY, RELIABILITY AND MAINTAINABILITY

                                              Related to
Functions                           Safety   Reliability   Maintainability

Fault Tolerance                                   X
Built-in Diagnostic                   X           X               X
Actuator Signature Analysis                                       X
Calibration Curve Retention                                       X
In Situ Calibration                               X               X
Data Storing for Fault Isolation                                  X

FIGURE 6 FAULT DETECTION AND TOLERANCE USING REDUNDANCY
(panel label: Advanced Fault Tolerance)

Triple redundancy provides fault tolerance of a single failure. Triple redundancy
with a spare provides fault tolerance of two failures.
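
The difference between detection and tolerance can be seen in a simple majority voter. The sketch below is illustrative only: it assumes digital readings compared by exact equality (analog channels would need a tolerance band), and the function name `vote` is hypothetical.

```python
from collections import Counter


def vote(readings):
    """Majority vote over redundant channels.

    Returns (majority_value, indices_of_disagreeing_channels).
    Raises ValueError when no strict majority exists, in which case the
    disagreement is detected but cannot be masked.
    """
    value, count = Counter(readings).most_common(1)[0]
    if count * 2 <= len(readings):
        raise ValueError("no majority; fault detected but not tolerated")
    faulty = [i for i, r in enumerate(readings) if r != value]
    return value, faulty
```

With two channels a disagreement raises the error (detection only); with three channels a single bad channel is outvoted and identified, so a spare could then be switched in to restore the ability to mask a second failure.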

Instead of redundancy comparisons, a signal conditioner built-in checkout
(BIC) or a computer built-in diagnostic (BID) circuit can be used to detect
failures. Using a BIC or BID circuit and a spare or redundancy, a fault
tolerance of one failure can be achieved. Figure 7 depicts the configuration
using BIC/BID.

Level of Redundancies. Redundancies can be implemented at different levels
such as parts, components, assemblies and systems. Figure 8 shows the reliability
curves of four different configurations resulting from a recent spaceborne
fault tolerance study. It was shown that using a single computer
with voting logic and spares within the same computer will produce the highest
reliability. This is illustrated in the calculations and comparison of Figure 9.
Redundancies at assembly or system levels are sometimes required in order for
the speed of system recovery from a faulty component to be adequate. The total
time between a failure and the completion of the recovery/reconfiguration
sequence is critical for some applications such as spacecraft reentry or
launch control computers. For EC/LSS, recovery speed is not critical and
redundancies at the component level are feasible and recommended.
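
The advantage of component-level redundancy can be checked with a small calculation. The sketch below does not reproduce the Figure 9 calculation; it is a textbook comparison that assumes independent failures, identical component reliability r, and ideal (failure-free, zero-overhead) detection and switching logic. All names are hypothetical.

```python
def series(reliabilities):
    """All elements must work."""
    product = 1.0
    for r in reliabilities:
        product *= r
    return product


def parallel(r, copies):
    """At least one of several identical copies must work."""
    return 1.0 - (1.0 - r) ** copies


def system_level(r, n_components, copies=2):
    """Duplicate the whole chain of n components."""
    return parallel(series([r] * n_components), copies)


def component_level(r, n_components, copies=2):
    """Duplicate each component individually, then chain the redundant pairs."""
    return series([parallel(r, copies)] * n_components)


if __name__ == "__main__":
    # Hypothetical component reliability of 0.9 in a three-component chain.
    print(system_level(0.9, 3), component_level(0.9, 3))
```

For r = 0.9 and three components, duplicating the whole chain gives about 0.927 while duplicating each component gives about 0.970, because a component-level arrangement survives any combination of failures that leaves one good copy of each component.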

Built-in Diagnostic and Checkout Circuits

As shown in Figure 10, the concept of a BID is to use the computing and processing
power of a microprocessor to detect failures of:

• Central Processing Unit (CPU)

• Memory

• Control and Data Bus

• A/D Interface

• Software

These five items are the key components of the EC/LSS C/M I. In addition, the
BID can be designed to monitor critical process parameters directly.

A second built-in self-test function is performed by the BIC circuit. The
concept of BIC is to perform active functional checkout testing of signal
conditioners.

Both the BID and BIC circuits basically provide fault detection functions.
Fault tolerance is achieved only when the BID/BIC are used with a redundant or
spare component with reconfiguration logic circuits.

Fault isolation and maintenance aids can be provided by BID and BIC when
additional circuits are incorporated to communicate with the operator and
indicate the failed components. A typical application is shown in Figure 11.

As a part of this program effort, a BID circuit was designed and demonstrated.
This will be discussed later in this report.

Actuator Fault Prediction 

A mechanical device usually produces sound and vibration when running. The 
characteristics of the sound or vibration reflect the uniqueness of the device 

FIGURE 7 FAULT DETECTION AND TOLERANCE USING BIC/BID

FIGURE 8 RELIABILITY CURVES

FIGURE 9 RELIABILITY COMPARISON EXAMPLE

FIGURE 10 MICROPROCESSOR-BASED BID CIRCUIT

and its running condition. In other words, the sound or vibration produced by
a machine contains information that can be used to judge the condition of the 
machine. There are some commercially available vibration monitors using velo- 
city and accelerometer type vibration transducers. Monitoring the vibration 
pattern requires a number of transducers installed at different places in the 
machine to detect vibrations in different directions and of different parts. 

Most vibration sensors are invasive and have to be attached to internal parts
of the machine. Sound monitoring has the advantage of being noninvasive and of
not requiring many sensors distributed in the system, but it has the disadvantage of
being more susceptible to background noises. An audio signature analyzer which
utilizes pattern recognition techniques to detect or predict the fault of a
running machine by its audio signature will have a better maintainability and, 
because of the fewer transducers involved, a better hardware reliability. 

Figure 12 explains the concept of the analyzer. Microphones or audio sensors 
are used to obtain the audio waveform generated by the running device such as 
a motor. The audio information passing through the signal conditioning stage 
and digitization stage is converted into digital signals and then sent into a 
computer for processing. The pattern recognition technique includes two steps. 
In the first step, the computer extracts the information from an input pattern. 
The extracted information expressed as numbers, usually called features, should
contain the unique characteristics of the input pattern. In the second step, 
the computer analyzes the pattern features and outputs its judgment. Figure 13 
presents the block diagram of the process. 

Audio Pattern Recognition. An audio waveform generated by a running machine
is a time-varying function. At any time instant, the audio pattern can be 
analyzed as a function of frequency (an audio spectrum). Figure 14 illustrates 
an audio pattern in three-dimensional space. For simplicity of discussion, the 
audio spectrum is assumed to be stationary, that is, the audio spectrum is inde- 
pendent of time. 


For a computer to process the signal, the continuous audio spectrum $A_f$ (see
Figure 14) should be quantized along the frequency axis. Then, the magnitude
at each frequency should be converted into a digital number. The pattern can
now be described as a discrete pattern $D_f$, which is a vector of k variables:

$$D_f = [d_{f1}, d_{f2}, \ldots, d_{fk}]^T \qquad (1)$$
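
The two quantization steps can be sketched in a few lines. This example is an illustration under stated assumptions, not the report's implementation: it evaluates a direct discrete Fourier transform at k frequency bins (a practical analyzer would likely use an FFT or a filter bank), scales each magnitude to an assumed 8-bit converter range, and uses hypothetical names throughout. It assumes k is less than half the number of samples.

```python
import cmath
import math


def discrete_pattern(samples, k, full_scale=1.0, levels=255):
    """Return k quantized spectral magnitudes (DFT bins 1..k) of one audio frame."""
    n = len(samples)
    pattern = []
    for i in range(1, k + 1):
        # Step 1: sample the spectrum at discrete frequency bin i.
        acc = sum(s * cmath.exp(-2j * math.pi * i * t / n)
                  for t, s in enumerate(samples))
        magnitude = 2.0 * abs(acc) / n  # amplitude of that frequency component
        # Step 2: convert the magnitude into a digital number.
        code = round(magnitude / full_scale * levels)
        pattern.append(min(levels, max(0, code)))  # clamp to the converter range
    return pattern


if __name__ == "__main__":
    # Hypothetical test tone that falls exactly on bin 3 of a 64-sample frame.
    frame = [math.sin(2.0 * math.pi * 3 * t / 64) for t in range(64)]
    print(discrete_pattern(frame, k=8))
```

Each call yields one k-element vector; the number of bins k and the converter resolution set the memory needed per pattern.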


Because of random noise, the distribution of the discrete patterns of normal 
conditions usually forms a k-dimensional Gaussian function. Figure 15 shows 
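the probability density function of one-dimensional Gaussian distribution.

The probability density function of k-dimensional Gaussian distribution is
as follows:

$$P(D_f) = \frac{1}{(2\pi)^{k/2}\,|A|^{1/2}} \exp\left[-\frac{1}{2}(D_f-\mu_f)^T A^{-1}(D_f-\mu_f)\right] \qquad (2)$$

where

A = Covariance Matrix
$\mu_f$ = Pattern Mean = $[\mu_{f1}, \mu_{f2}, \ldots, \mu_{fk}]^T$
T = Transpose of Matrix
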

Figure 16 shows the probability density functions of normal patterns and abnor- 
mal patterns. If the pattern occurs in the normal region, it implies that the 

FIGURE 12 AN AUDIO ANALYZER

FIGURE 13 BLOCK DIAGRAM OF AUDIO ANALYZER

$$P(M) = \frac{1}{h\sqrt{2\pi}} \exp\left[-\frac{1}{2h^2}(M-m)^2\right]$$

h = Deviation
m = Mean

FIGURE 15 ONE-DIMENSIONAL GAUSSIAN DISTRIBUTION

FIGURE 16 CLASSIFICATION OF PATTERNS

probability of the machine being normal is higher than being abnormal. For
minimum risk, the computer is instructed to classify a pattern which falls in
normal regions as normal and which falls outside normal regions as abnormal. 
This is called the Bayesian classification rule. 
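
For a single feature the rule reduces to comparing two Gaussian densities. The sketch below assumes equal prior probabilities and equal misclassification costs for the normal and abnormal classes, which the report does not state; the function names and the class parameters in the example are hypothetical.

```python
import math


def gaussian_pdf(x, mean, deviation):
    """One-dimensional Gaussian probability density."""
    z = (x - mean) / deviation
    return math.exp(-0.5 * z * z) / (deviation * math.sqrt(2.0 * math.pi))


def classify(x, normal, abnormal):
    """Pick the class with the higher density at x.

    normal and abnormal are (mean, deviation) pairs.
    Assumption: equal priors and equal costs for both classes.
    """
    if gaussian_pdf(x, *normal) >= gaussian_pdf(x, *abnormal):
        return "normal"
    return "abnormal"
```

For example, with a normal class at (0, 1) and an abnormal class at (4, 1), the boundary of the normal region falls at the midpoint, 2.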

To simplify computation and to minimize memory size, a weighted distance
method is suggested. In this method, when the weighted distance between the
pattern and the center of distribution is not greater than a safety distance S,
the pattern is recognized as normal; otherwise, abnormal (see Figure 16). In
other words, a safe region is made around the center of distribution of normal
pattern. The pattern is considered normal and safe within this region. The
decision rules can be described by the following inequalities:

$$\text{If } \sum_{i=1}^{k} W_i\,(d_{fi}-\mu_{fi})^2 \le S^2, \text{ the pattern is normal; otherwise, abnormal} \qquad (3)$$

where $\mu_{fi}$ is the mean of variable $d_{fi}$ (see equation (2)) and $W_i$ is the weight applied to that variable.
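
The weighted distance test needs only k means, k weights and one threshold in memory. A minimal sketch follows; the choice of weights is an assumption (the reciprocal of each variable's variance is one plausible choice), and the names are hypothetical.

```python
def is_normal(pattern, means, weights, safety_distance):
    """Weighted distance test: normal when the pattern lies inside the safe region.

    Assumption: weights are non-negative, e.g. 1/variance of each variable.
    """
    distance_squared = sum(w * (d - m) ** 2
                           for d, m, w in zip(pattern, means, weights))
    return distance_squared <= safety_distance ** 2
```

Compared with evaluating the full k-dimensional density, this avoids storing and inverting a k-by-k covariance matrix, at the price of ignoring correlation between variables.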

If the distributions of normal and abnormal patterns are far apart or the 
pattern distributions are special, the setpoint classification method may find 
possible use as the recognition scheme. The setpoint classification method is 
the method used in the present Model 100 C/M I. However, if the distribution 
functions of normal and abnormal patterns are Gaussian (normal distribution) 
and close to each other, the setpoint classification method will be inaccurate. 

The recognition principles introduced in this section have been successfully 
used in various applications.

It is often desired to know more about the abnormal patterns of a running 
machine. For instance, it is helpful to know the possible causes of the 
abnormality. The abnormal patterns can be divided into many classes with 
respect to the causes. Each class has a distribution function. Multiclass
classification techniques can then be applied.

Computer Learning. For the computer to operate as desired, it has to know
where the safe region is or what the normal and abnormal parameters of the dis- 
tribution function are. There are usually two ways of teaching the computers. 
One is the preset method and the other is by computer learning. In the preset 
method, the parameters are preset when the system is built. The computer 
learning capability allows the audio analyzer to infer the parameters by 
itself. Since the audio patterns are affected by the housing of the machine 
or the environment such as echo and other noises, the normal audio patterns of 
the same machine might be slightly different from system to system. The 
computer learning capability, thus, appears to be more desirable. However,
the available learning techniques normally require long computation time
and large memory. 

A compromise is to use a development system to do the computer learning at the 
installation stage. Thereafter, the parameters are set for that particular 
running machine in that particular environment and the computer learning 
software is no longer needed in the audio analyzer memory. Such a special 
purpose signature analyzer can be called a noise watcher, which watches the
noise generated by the machine and provides warnings when the noise indicates
that something is wrong.

The computer learning processes are usually categorized as sequential and
nonsequential. The sequential procedures are more complex but require less
memory for storing learning data. In the sequential procedure, learning data
are used and dropped one by one to upgrade the existing parameters. In the
nonsequential procedure, the learning data are collected, stored and processed
at the same time. Although both methods obtain the statistics from the learning
samples, the sequential procedures fit our needs better since the computer can
keep learning until the parameters are satisfactory without increasing the
memory for learning samples.
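As an illustration of the sequential procedure, the following sketch maintains
a running mean and variance for one feature (the loudness at one frequency)
without retaining any learning samples. The update rule shown is Welford's
method, chosen here only as a familiar example; no specific algorithm is named
in this report, and the class and variable names are hypothetical.

```python
class SequentialStats:
    """Running mean and variance of one feature; no learning samples are stored."""

    def __init__(self):
        self.n = 0        # number of learning samples seen so far
        self.mean = 0.0   # current estimate of the mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x):
        """Use one learning sample to upgrade the parameters, then drop it."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def variance(self):
        """Unbiased sample variance (0.0 until two samples have been seen)."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


# Invented loudness readings, used only to exercise the update rule.
stats = SequentialStats()
for loudness in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(loudness)
```

Each sample is used once and then dropped, so the storage stays at three
numbers per feature however long the learning continues.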

Nonstationary Audio Pattern. In the previous discussion, the audio spectra of
normal patterns are assumed to be stationary. This assumption reduces the
features required to recognize a pattern and makes the analysis simpler. If a
normal audio spectrum happens to be time-varying and repetitive, further
research on sampling rate and periodicity is necessary. Also, the beginning
sampling time should be studied to avoid the trouble caused by time shift. A
periodic audio spectrum will inevitably require more features to characterize
the pattern.

The time information is very important in performance trend analysis for fault
prediction. As a simple example, if the noise of the running motor gets louder 
and louder, something must be going wrong with the motor. 

Dimensionality. As mentioned before, the pattern is expressed by a vector which
has many variables. Each of these variables is the loudness of the sound at a
certain frequency. The number of variables required greatly affects the
computation complexity and the memory requirement. Therefore, it is necessary
to select the particular frequencies from which the most important information
can be extracted. This procedure is called feature selection. Feature selection
is very important to the performance of the audio signature analyzer. A
feasibility study using existing selection techniques is necessary. If only a
few variables are needed to reach a very high recognition accuracy, a small
computer will be sufficient to implement such a system. If a large number of
features are required to reach a reasonable recognition accuracy, a larger
computer may be needed. If the recognition accuracy cannot be met by audio
sensors, more information provided by vibration sensors or some other invasive
sensors may be necessary. In the worst case, the analyzer could be too complex
to be feasible.
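One simple way to rank candidate frequencies is to score each one by how well
it separates the normal and abnormal learning samples. The sketch below uses a
Fisher-type ratio (squared difference of the class means over the sum of the
class variances). This criterion is an assumption made for illustration, not a
technique identified in this report, and the sample patterns are invented.

```python
from statistics import mean, pvariance


def fisher_scores(normal, abnormal):
    """Score each frequency bin: (difference of class means)^2 / (sum of class variances)."""
    scores = []
    for k in range(len(normal[0])):
        a = [pattern[k] for pattern in normal]
        b = [pattern[k] for pattern in abnormal]
        gap = (mean(a) - mean(b)) ** 2
        spread = pvariance(a) + pvariance(b)
        if spread > 0:
            scores.append(gap / spread)
        else:
            # Degenerate case with no spread in either class (handling assumed for this sketch).
            scores.append(float("inf") if gap > 0 else 0.0)
    return scores


def select_features(normal, abnormal, count):
    """Return the indices of the `count` highest-scoring frequency bins."""
    scores = fisher_scores(normal, abnormal)
    ranked = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    return sorted(ranked[:count])


# Invented loudness patterns at three frequencies; only the middle one differs by class.
normal = [[1.0, 5.0, 3.0], [1.2, 5.1, 2.0], [0.8, 4.9, 4.0]]
abnormal = [[1.1, 8.0, 3.5], [0.9, 8.2, 2.5], [1.0, 7.8, 3.0]]
best = select_features(normal, abnormal, 1)
```

With these invented patterns only the middle frequency carries class
information, so a one-feature analyzer would retain that frequency alone.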

A feasibility study will require an audio sensor, a spectrum analyzer, and
several mechanical rotational machines of interest that are modifiable to
produce abnormal patterns.

Calibration Curve Retention 


When a sensor in a flight hardware C/M I is faulty it will be isolated and
replaced with a spare sensor. The calibration curve of the spare sensor is not
necessarily the same as that of the original one. The methods for and size
impacts of retaining calibration curves in computer memory for spare sensors
have been studied. The techniques of data compression and information coding
are required.

Calibration Curve Storage Techniques. Four calibration curve storage techniques
are presented in this section. In ideal cases the calibration curves can
be represented by mathematical models ranging from straight lines to complex
nonlinear parametric functions. When these mathematical models are not
applicable, point-by-point storage can be used to represent the irregular
curves. These methods are described in the following order of increasing
complexity.

1. Linear Approximation. The calibration curves are represented by
straight lines. This is the simplest method.

2. Nonlinear Parametric Approximation. In this method the calibration
curves are approximated by N-degree polynomial functions, exponential
functions, or some other parametric functions. The total number
of parameters required is not large, so the memory requirement will
not be a problem.

3. Piecewise Linear Approximation. In this method each calibration
curve is approximated by a number of linear functions whose
coefficients are then stored in the memory for processing. This method
can easily be applied to all kinds of curves, but the number of
linear functions affects the approximation accuracy and the memory
efficiency.

4. Point-By-Point Storage. With this method the calibration curves are
stored point-by-point in the memory. This method requires more
memory than those mentioned before. The interpolation method or
weighted average method can be used to calculate the value between
points.

Different techniques are available for point-by-point storage of 
calibration curves. Fortunately, the sensor calibration curves 
normally do not require these techniques. 
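The storage techniques above can be sketched as two small evaluation routines:
one for curves stored as polynomial coefficients (two coefficients give the
linear case) and one for curves stored as a table of points with linear
interpolation between them. The function names, the choice to store
breakpoints rather than slope and intercept pairs, and the clamping of
out-of-range readings are assumptions made for this illustration.

```python
from bisect import bisect_right


def eval_polynomial(coeffs, raw):
    """Evaluate c0 + c1*raw + c2*raw**2 + ... by Horner's rule.

    Two coefficients give the linear case; three or four give the second and
    third order polynomials.
    """
    value = 0.0
    for c in reversed(coeffs):
        value = value * raw + c
    return value


def eval_points(points, raw):
    """Linear interpolation between stored (raw, engineering value) points.

    `points` must be sorted by raw reading. Readings outside the table are
    clamped to the end points (an assumption of this sketch).
    """
    xs = [p[0] for p in points]
    if raw <= xs[0]:
        return points[0][1]
    if raw >= xs[-1]:
        return points[-1][1]
    j = bisect_right(xs, raw)
    (x0, y0), (x1, y1) = points[j - 1], points[j]
    return y0 + (y1 - y0) * (raw - x0) / (x1 - x0)


# Invented curves for a spare sensor.
linear_value = eval_polynomial([1.0, 2.0], 3.0)          # 1 + 2*3
quadratic_value = eval_polynomial([1.0, 0.0, 2.0], 3.0)  # 1 + 2*3**2
table = [(0.0, 0.0), (10.0, 100.0), (20.0, 150.0)]
table_value = eval_points(table, 15.0)
```

A linear curve costs two stored numbers and a third order polynomial four,
whereas the point table grows with the number of breakpoints retained.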

Storage Location Options. If the calibration curves can be approximated by
functions with a small number of parameters, a large memory space is not
required to store the parameters for the spare sensors. The calibration curves
could be stored in the following locations:

1. Main Memory. The erasable/programmable read-only memory (EPROM) of
the main computer memory can be used to store the curves.

2. Secondary Memory . For laboratory R&D C/M I's, the disk or cassette 
can be used to store a large amount of data. But they are not 
feasible for flight hardware applications because of the physical 
size. The bubble memory is a viable candidate for calibration curve 
retention. 

3. At the Sensor Site . The calibration mechanism may be implemented as 
a part of the sensor with a microprocessor at the sensor site. In 
this approach the signal conditioning, digitization, multiplexing 
and calibration are distributed to the sensor sites. The signal 
transmitted to the computer will be digitized and calibrated. 




The ARS Sensor Calibration Curves. The method used for storing a calibration
curve depends on the characteristics of the calibration curve. In other words,
the proper method of storing the calibration curve is sensor dependent.

The sensors used in the ARS hardware and the recommended storage techniques are
shown in Table 4. Most of them are linearly calibrated within specified ranges.
For the rest, second order polynomials can approximate the curves very well
except for the flow rate sensor, which may need third order polynomials or
exponential functions.

Because the linear, second and third order polynomial functions only require 
two to four parameters, the memory size impact is nominal. Storing them in 
the EPROM of the C/M I main computer memory is feasible. Some of the ARS 
sensors will be digitized and multiplexed at the sensor sites. In this case, 
the calibration curves will naturally be stored at the sensor sites. 

Automatic In Situ Sensor Calibrations

The advantages and techniques of automatic in situ sensor calibrations were
investigated. The sensor types used in an ARS are:

• Current 

• Voltage 

• Temperature 

• Pressure 

• Flow Rate 

• Speed 

• Dew Point 

• Combustible Gas Concentration 

• Relative Humidity 

Advantages of In Situ Calibration. In an automated system such as the
regenerative ARS, the system performance depends significantly on how well the
sensors are calibrated. A sensor which is not calibrated may be hazardous to
the system operation; the combustible gas concentration sensor is a good
example. With in situ calibration capability, the sensors can be calibrated
on-line frequently without the need for operator intervention. Automatic in
situ calibration also avoids the operator errors implicit in a manual
calibration. In summary, the advantages of automatic in situ calibration are:

• Calibration without crew service 

• No human error 

• More accurate sensors 

TABLE 4 RETENTION OF SPARE SENSOR CALIBRATION CURVES

A. Sensor Types and Calibration Curves

Sensor Type          Calibration Curve
-----------          -----------------
Current              N/A (No Spares)
Voltage              N/A (No Spares)
Temperature          Linear
Pressure             Linear
Flow Rate            Second or Third Order Poly.
Speed                Linear
Dew Point            Linear
Combustible Gas      Second Order Poly.
RH                   N/A (Calculated, No Spares)

B. Storage Location Options

• Main Memory

• Secondary Memory

• Part of the Sensor

Automatic In Situ Calibration Techniques. Automatic in situ calibrations
typically require the generation of standard physical/chemical conditions for
the sensor signal conditioner adjustments. For example, a standard pressure is
required for zeroing the pressure sensor signal conditioner and another
standard pressure is required for the adjustment of the signal conditioner
span. Generation of the standard physical/chemical calibration conditions is
unique to each sensor type. The calibration sequences and the techniques after
the standard conditions have been generated are common to all types of sensors.

The common sequence of events in an automatic in situ calibration function is
as follows:

1. Generation of the standard sensor condition for signal conditioner
zero adjustment.

2. Adjustment of the electronic zero circuit in the sensor signal
conditioner.

3. Generation of the sensor condition for span adjustment.

4. Adjustment of the electronic span circuit in the sensor signal
conditioner.

5. Sensor calibration completed; return to normal operation.

The sequence of events described above is only applicable to the sensors which
have linear calibration curves within the application range. For the sensors
with nonlinear calibration curves within the application range, more than two
calibration points are required and the automatic in situ calibration capability
should include the generation of multiple calibration conditions. For the
regenerative ARS application, however, the linear approximation of sensor curves
is acceptable. This means that only two calibration points are required during
the automatic in situ calibration operation. The sequence of events described
above may have to be repeated depending on (1) how critical the sensor is, and
(2) how fast the sensor drifts. In general, it is adequate to repeat the
calibration sequence once. That is, the sequence "put sensor off line, generate
zero condition, adjust zero, generate span condition, adjust span, generate zero
condition, adjust zero, generate span condition, adjust span and put sensor back
on line" is adequate for the automatic in situ calibration of most of the
regenerative ARS sensors.
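The two-point sequence, and the effect of repeating it once, can be sketched
with a simple model. The model below is an assumption made for illustration:
the signal conditioner is taken as reading = gain * raw + offset, the zero
adjustment changes only the offset, the span adjustment changes only the gain,
and the drifted sensor characteristic is invented. In such a model the span
adjustment disturbs the zero reading whenever the raw signal at the zero
condition is not exactly zero, which is one reason a second pass helps.

```python
class SignalConditioner:
    """Hypothetical conditioner model: reading = gain * raw + offset."""

    def __init__(self):
        self.gain = 1.0
        self.offset = 0.0

    def reading(self, raw):
        return self.gain * raw + self.offset


def calibrate(conditioner, sensor, zero_std, span_std, passes=2):
    """Run the zero/span sequence `passes` times (two passes = sequence repeated once)."""
    # The sensor would be put off line here.
    for _ in range(passes):
        raw = sensor(zero_std)                                     # generate zero condition
        conditioner.offset += zero_std - conditioner.reading(raw)  # adjust zero
        raw = sensor(span_std)                                     # generate span condition
        conditioner.gain = (span_std - conditioner.offset) / raw   # adjust span
    # The sensor would be put back on line here.
    return conditioner


def drifted_sensor(x):
    """Invented raw sensor characteristic with a gain error and a zero shift."""
    return 0.5 * x + 0.2


once = calibrate(SignalConditioner(), drifted_sensor, 0.0, 10.0, passes=1)
twice = calibrate(SignalConditioner(), drifted_sensor, 0.0, 10.0, passes=2)
zero_error_once = abs(once.reading(drifted_sensor(0.0)) - 0.0)
zero_error_twice = abs(twice.reading(drifted_sensor(0.0)) - 0.0)
```

In this invented case the residual zero error after the second pass is much
smaller than after the first, while the span reading is exact after either.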

Automatic In Situ Generation of Calibration Conditions. To perform automatic
in situ calibration, the calibration conditions such as reference voltage,
standard temperature, standard pressure and standard combustible gas
concentration are required. For some of the sensors, generation of the
calibration conditions may be too difficult to be feasible.

Table 5 summarizes the readily available calibration techniques for ARS
application. Current and voltage sensors can be calibrated by using standard
voltage signals generated by zener reference diodes. Automatic in situ
calibration of speed sensors can be achieved by the generation of electrical
pulses which simulate the optical/magnetic pickups from a speed sensor.
Combustible gas concentration sensors can be calibrated using water vapor
electrolysis (WVE) generated hydrogen (H2) concentration in air.

Temperature, pressure, flow rate, dew point and relative humidity sensors are
difficult to calibrate with automatic in situ calibration functions without
further development efforts. For the critical sensors in a system, the
alternatives to avoid technical difficulties in automatic in situ calibrations
are:

a. Use triple redundant sensors.

b. Use a spare sensor with calibration curve stored in the computer
memory.

TABLE 5 AUTOMATIC IN SITU SENSOR CALIBRATION

Sensor Type        Feasible Calibration Technique   Alternative
-----------        ------------------------------   -----------
Current            Standard Reference Voltage       --
Voltage            Standard Reference Voltage       --
Temperature        --                               Triple Redundant or Spare
Pressure           --                               Triple Redundant or Spare
Flow Rate          --                               Triple Redundant or Spare
Speed              Electrical Pulse Generator       --
Dew Point          --                               Triple Redundant or Spare
Combustible Gas    Water Vapor Electrolysis         --
RH                 --                               Triple Redundant or Spare

System Reliability and Maintainability Impact. Automatic in situ sensor
calibration capability can increase system reliability and maintainability.

The capability increases system reliability by avoiding shutdowns caused by
uncalibrated sensors. It increases the maintainability by the reduction or
elimination of operator intervention during a calibration procedure.

On-Line Real-Time Diagnosis

A diagnostic program can be used by the C/M I computer to find out the cause
of an out-of-tolerance event.

When an out-of-tolerance event occurs, the sensor values before, during and 
after the event usually show some implications to the cause of the event. Very 
often, an on-line real-time diagnostic is desired. This section discusses the 
concept and feasibility of implementing such a system. 

Approach 1. When an out-of-tolerance event occurs, the computer automatically
calls the diagnostic program which will then search for useful information 
through a certain period of sensor data history which has been stored in the 
computer memory. This approach requires a large memory for storing the sensor 
data. Some of the sensor data may not be useful for diagnosing that particular 
event. Also, the result of this diagnostic program may be too late to prevent 
the system from shutdown because the execution of such a diagnostic program 
requires time. This approach has an advantage, however, in that it does not 
require CPU time if no out-of-tolerance event occurs. 
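Approach 1 can be sketched as a fixed-depth history buffer that is searched
only when an out-of-tolerance event occurs. The depth of 16 sets, the single
high limit and the very simple search (counting consecutive rises before the
event) are assumptions made for this illustration; the class and method names
are hypothetical.

```python
from collections import deque

HISTORY_DEPTH = 16  # assumed number of sensor data sets retained


class HistoryDiagnostic:
    def __init__(self, limit, depth=HISTORY_DEPTH):
        self.limit = limit
        self.history = deque(maxlen=depth)  # the oldest set is discarded automatically

    def record(self, value):
        """Store one sensor value; run the diagnostic only on an out-of-tolerance event."""
        self.history.append(value)
        if value > self.limit:
            return self.diagnose()
        return None

    def diagnose(self):
        """Search the stored history: count consecutive rises leading up to the event."""
        data = list(self.history)
        rises = 0
        for earlier, later in zip(reversed(data[:-1]), reversed(data[1:])):
            if later > earlier:
                rises += 1
            else:
                break
        return {"samples_stored": len(data), "consecutive_rises": rises}


# Invented data: a steadily rising value that crosses the limit at the last sample.
diagnostic = HistoryDiagnostic(limit=18.5)
results = [diagnostic.record(v) for v in range(20)]
```

No diagnostic work is done until the limit is crossed, at the cost of keeping
the whole history in memory.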

Approach 2. The diagnostic program runs all the time and keeps track of the
system running conditions. The system performing conditions can be divided
into many stages; each stage is represented by a "state." The event
implications detected from the sensor data are described in the state. In
other words, the diagnostic program is processing the sensor data all the time
and may send out early warning messages when the system enters certain states.
This approach has two advantages. One, it does not require a large memory
size. The diagnostic program processes the data and drops out the useless
data, extracts the important information and updates the present state. The
sensor data history does not have to be stored in the memory. Two, the
diagnostic result is available whenever the event occurs unless the diagnostic
requires some post-event information. This approach requires CPU time when
the system is running. If the CPU has enough free time to run the diagnostic
program, this requirement has no negative effect on the system. It becomes a
drawback, however, when the CPU time is not sufficient to run the diagnostic
program.

Comparing the above two approaches, the second one is better than the first
one if the CPU is not overloaded by the diagnostic program. The efficiency
and accuracy of such a diagnostic program are strongly related to the nature
and characteristics of the mechanical system. If the implication of the event
is very complicated and ambiguous, an accurate diagnostic program will be very
difficult to implement. A diagnostic program itself may occupy a large memory
space and require long computation time. If the implication of events is simple,
clear and easy to divide into a small number of states, the diagnostic
program implemented will be efficient and accurate.

The following procedure describes how to implement such a diagnostic program
using Approach 2:

1. The nature and characteristics of the parts and function of the
mechanical system should be fully understood. The implication in
the sensor data of the time preceding and following any
out-of-tolerance event and the corresponding causes should be analyzed
and fully understood.

2. The implication should be studied in detail and divided into
representative states. Some descriptive variables (attributes) may be
necessary to describe the state. The interrelationship among the
states should also be defined. For example, one state may represent the
current monotonically increasing at rate x; another may represent a
voltage changing up and down alternately with period p and present
changing rate x.

3. Prepare the state transition diagram according to the
interrelationship among the states. The computation of attributes when
a transition is made should also be formulated. For example, Figure 17
shows a state transition diagram. Each circle represents a state.
The double circle indicates the normal state. An input is
represented by an "i." Statements in the lower half of the circles are
the warning messages output by the computer at that state. The
conditions for a transition and the computation rules of the attributes
are given in Table 6. Figure 18 represents some possible sensor
value curves. With the proposed diagnostic program, the curves are
already characterized by the states when warning occurs. Storage
for the past 16 sets of sensor data is no longer necessary.

4. Write a diagnostic program that simulates the state transition
diagram. Such an algorithm is usually referred to as an "attributed
automaton" in Formal Languages and/or Artificial Intelligence.
Using a high level language to implement such a software routine is
recommended because of the simplicity of modifying the transition
diagram.

Approach 2 requires less memory for data storage than Approach 1 and may give
warning early enough to prevent a hardware failure. The accuracy of the
diagnostic results depends on the completeness of the transition diagram. The
complexity of the transition diagram determines memory size requirements and
software implementation effort.
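A minimal sketch of an attributed automaton of the kind described in the
procedure above is given below. The three states, the two limits and the
warning messages are invented for illustration and do not reproduce Figure 17
or Table 6; only the attribute rules x = i - i0 and i0 = i follow the form
used in the table. Only the present state and its attributes are kept, so no
sensor data history is stored.

```python
class AttributedAutomaton:
    # For each state: (condition on the automaton and the input, next state), checked in order.
    # The states and conditions are hypothetical.
    TRANSITIONS = {
        "NORMAL": [(lambda m, i: i > m.low, "RISING")],
        "RISING": [(lambda m, i: i > m.high, "ALARM"),
                   (lambda m, i: i <= m.low, "NORMAL")],
        "ALARM": [(lambda m, i: i <= m.low, "NORMAL")],
    }
    MESSAGES = {
        "NORMAL": None,
        "RISING": "warning: input above low limit",
        "ALARM": "alarm: input above high limit",
    }

    def __init__(self, low, high):
        self.low, self.high = low, high
        self.state = "NORMAL"
        self.i0 = None   # attribute: previous input
        self.x = 0.0     # attribute: change since the previous input

    def step(self, i):
        """Process one input: update the attributes, then take at most one transition."""
        self.x = 0.0 if self.i0 is None else i - self.i0   # x = i - i0
        self.i0 = i                                        # i0 = i
        for condition, next_state in self.TRANSITIONS[self.state]:
            if condition(self, i):
                self.state = next_state
                break
        return self.state, self.MESSAGES[self.state]


automaton = AttributedAutomaton(low=5, high=8)
trace = [automaton.step(i)[0] for i in (3, 4, 6, 7, 9, 7, 4)]
```

Because the transition table is ordinary data, modifying the diagram amounts
to editing the table, which is the convenience a high level language offers
here.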


TABLE 6 TRANSITION CONDITIONS AND ATTRIBUTE RULES

[Table listing, for transitions 1 through 17, the transition conditions and
the attribute computation rules; the individual entries are not legible in
the source.]


DEMONSTRATION

As a part of the program effort, the design and demonstration of advanced fault
diagnostic instrumentation concepts were completed. These concepts can become
standards for the overall EC/LSS hardware development. Two specific concepts
were developed to the stage of laboratory demonstration. They are: (a)
intercomputer communication protocols with error detection capability and (b)
microprocessor-based BID circuit.

Intercomputer Communication with Error Detection 

Two intercomputer communication protocols were established for communication
between the computers. These computer links render a Computer Automation LSI-2
minicomputer capable of communicating with another LSI-2 minicomputer or an
Intel 8-bit microcomputer through the standard Electronic Industries Association
(EIA) RS-232C interface (an industry standard bit-serial digital transmission
protocol) with transmission error checking capability. A typical application
of the computer links is the connection between subsystem instrumentation and
the central computer. Another example of application is the connection of an
instrumentation prototype to a DARS during the R&D phase of the development.

This connection is presently done with a pair of byte-parallel general purpose
picoprocessors, for which the cable length is limited to 15 feet. Using the
RS-232C link, the picoprocessors are eliminated and the cable length is increased
to 100 feet.

Link 1: Between LSI-2 Computers

For two computers to talk with each other through the EIA RS-232C port, their
selection of parity checking, stop bit, character length, and the transmission
baud rate (bits/second) should be the same. The communication link for the
two LSI-2 computers requires a cable of five conductors: the signal received
(EIAR), transmitted (EIAT), Request to Send (RTS), Clear to Send (CTS) and a
common ground between the two interfaces. The protocol designed requires that
the interface time delay circuit be disabled.

The transmission can be done by a simple program as shown in Figure 19. Using
these instructions, however, the receiver should always be active prior to the
transmitter. To avoid this timing problem, the handshaking instructions
between the two computers and the time delay instructions in the transmitter
can be added before enabling the interrupt. The receiver outputs a logic one
to the RTS line (CTS line of transmitter) and then waits for the buffer to be
filled with input information. The transmitter keeps waiting and sensing
until CTS is logic one and then starts the output sequence. After this
handshaking the transmitter uses a counter to delay itself several instructions
in order to ensure that the receiver becomes active first and then starts
automatic I/O. At the end of the automatic I/O operation a checksum of the
transmitted data is generated by the receiver and compared to one generated and
transmitted by the transmitter. This protocol is shown in the flow charts in
Figure 20.

The checksum represents the sum of all the transferred words with the carries 
ignored. The checksum is used to detect the transmission error. Using this 
protocol, two computers can communicate with each other while both systems are 
running. Before the transmission begins the checksum is calculated by the 
transmitter and transferred to the receiver as two bytes during handshaking. 

At the end of the automatic I/O the system will call the End of Block (EOB)
subroutines. In the EOB subroutines the byte counts are checked at both sites
and the checksum is calculated at the receiver site and compared with the
transferred checksum to detect transmission errors. If any error is detected
it will jump to the error return location. The checksum calculation, storage
and transmission are transparent to the user. The user only needs to supply
the negative byte count and the byte address of the I/O buffer minus one, and
two locations for normal and error returns. From the error return, the user may
discard the transferred data and request another transfer.

FIGURE 19 A SIMPLE COMPUTER COMMUNICATION PROGRAM

[Figure 20: flow charts of the EIAAIN, EIAAOU, EOBAIN and EOBAOU routines]

Link 2: Between LSI-2 and Intel 8085 Microcomputer

This link is designed to connect the EIA RS-232C port of the LSI-2 minicomputer
and the CRT port of the Intel microcomputer development system (MDS). The
Intel MDS is wired so that the CTS and RTS are not accessible from the CRT
port. Therefore the Data Set Ready (DSR) and Data Terminal Ready (DTR) signals
are used instead. As mentioned previously, the selections of parity check,
stop bit, character length and transmission baud rate should be the same for
two computers to communicate through the EIA port. The Intel MDS can set/reset
the values of DSR and DTR. It is the user's responsibility to set the signals
of DSR and DTR to zero at the microcomputer site before and after the transfer.
The handshaking of the two computers is started from the MDS site. After
handshaking the transmission starts and transfers information byte by byte. At
the transmitter site the checksum byte is calculated before the transmission
and transferred as the last byte. The receiver will receive all the
information bytes and the checksum byte, and then will calculate the checksum
and compare the results. The LSI-2 will jump to the error return location if a
transmission error is detected after receiving data. The microcomputer will
have the accumulator equal to one when returning to the caller routine if an
error is detected after receiving data. The checksum operation is transparent
to users. The user should provide the negative byte count and byte address of
the I/O buffer. At the LSI-2 site the user should also provide two return
locations. At the Intel MDS site the information should be passed to the I/O
routines by registers; it is the user's responsibility to check the accumulator
for error information after receiving data.

The two links described not only transfer data between computers but also
detect transmission errors. The byte count checking and the checksum
operation programmed in the I/O subroutines detect the transmission errors.
These operations are transparent to users and assure the transmission accuracy.
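The receiving side of Link 2 can be sketched as below: the checksum byte
travels as the last byte, the receiver checks the byte count and the checksum,
and an error flag of one is returned on a mismatch, mirroring the accumulator
convention described for the microcomputer. The function names, the 8-bit sum
and the tuple return value are assumptions made for this illustration, not the
actual I/O subroutines.

```python
def frame(data):
    """Transmitter side: append the checksum byte (8-bit sum, carries ignored)."""
    return bytes(data) + bytes([sum(data) & 0xFF])


def receive(frame_bytes, expected_count):
    """Receiver side: return (payload, error_flag); error_flag is 1 on any mismatch."""
    if len(frame_bytes) != expected_count + 1:       # byte count check
        return b"", 1
    payload, received_checksum = frame_bytes[:-1], frame_bytes[-1]
    if (sum(payload) & 0xFF) != received_checksum:   # checksum check
        return b"", 1
    return payload, 0


message = bytes([0x01, 0x02, 0xFF])
sent = frame(message)
payload, error = receive(sent, expected_count=3)

damaged = bytearray(sent)
damaged[1] ^= 0x10                                   # one bit changed in transit
_, damaged_error = receive(bytes(damaged), expected_count=3)
```

On an error return the caller would discard the data and request another
transfer, as described for the error return location on the LSI-2.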

Built-in Diagnostic Circuit 

An Intel 8085 microprocessor-based circuit is designed to perform BID functions 
for a microcomputer or minicomputer implemented C/M I. The BID circuit has 
the following functions: 

• Requests the C/M I computer to initiate its Self-Diagnostic Program
(SDP) which verifies the functions of the CPU, memory and data bus.

• Interprets the SDP results as transmitted from the C/M I to the
BID circuit.

• Monitors the A/D interface card function.

• Monitors the C/M I computer to verify that it is active.

• Monitors, as a redundant fault detection function, up to two critical
process parameters directly (bypassing the computer and A/D interface).

• Outputs an emergency shutdown request signal when critical malfunctions
have been detected.

The block diagram of the BID circuit was shown previously in Figure 10. 

Figure 21 shows the actual circuit card layout of the BID. 

Hardware Design

The BID circuit, as shown in Figure 22, was designed with state-of-the-art
microprocessor technology. The heart of the BID is an Intel 8085 microprocessor
which is supported by an Intel 8155 RAM/IO (Random Access Memory/Input-Output),
a 2716 EPROM and an 8212 I/O. With this electronic configuration, the BID has
a memory capacity of 2K bytes of EPROM and 256 bytes of RAM. It has 22 general
purpose I/O lines divided into three ports. A comparator logic is used to
monitor the critical process parameters. The two parameters are compared with
high and low setpoints which are set at the absolute tolerance levels. The
setpoints are adjustable by on-card potentiometers. The results of the
comparator logic are sent to the microprocessor for a diagnostic decision.

To detect malfunctions of the CPU and A/D interface of the C/M I, the memory
reference control (read or write) signals and a software controlled A/D
interface pulse train are checked by the BID circuit. The checking is
accomplished by using retriggerable multivibrators (one-shots). The
retriggerable multivibrators are triggered at a constant frequency when the
C/M I and A/D interface are functioning properly. The pulse widths of the
one-shots are designed to be greater than the triggering periods. The one-shot
outputs should therefore always be at the triggered state until the pulses
stop. The microprocessor monitors the output levels, which indicate whether
the C/M I computer and the A/D interface card are functioning properly.
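The behavior of a retriggerable one-shot used as an activity monitor can be
sketched in software as follows. This is a timing model only, written for
illustration; the class name and the time values (arbitrary units) are
assumptions and do not describe the actual circuit components.

```python
class RetriggerableOneShot:
    """Output stays high while triggers keep arriving within one pulse width."""

    def __init__(self, pulse_width):
        self.pulse_width = pulse_width
        self.expires_at = None   # no trigger received yet

    def trigger(self, now):
        """Each trigger restarts the output pulse from the present time."""
        self.expires_at = now + self.pulse_width

    def output(self, now):
        """True while the output is at the triggered state."""
        return self.expires_at is not None and now < self.expires_at


# Pulse width (15) longer than the trigger period (10); the pulses stop after t = 30.
monitor = RetriggerableOneShot(pulse_width=15)
for t in (0, 10, 20, 30):
    monitor.trigger(t)
```

With the pulse width longer than the trigger period, the output never drops
while the monitored computer keeps pulsing, and drops one pulse width after
the last pulse once it stops.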

Software Design

The BID software program consists of the following:

• Power-up Routine

• Timer Interrupt Routine

• Monitor A which checks the digital input signals to verify that the
critical process parameters are within tolerance range, the C/M I
computer is running and the A/D interface is outputting pulses

• Monitor B which signals the C/M I computer for the initiation of the
SDP inside the C/M I and interprets the SDP results

The functions which Monitor A performs are designed to ensure that the C/M I
computer is running, the computer can output valid data through the A/D
interface and critical process parameters are within the safety range. If any
of the above is not true the BID circuit will request an emergency shutdown. In
some cases the emergency shutdown sequence may be a one step "power off"
operation. In others, the sequence may consist of actuators off, purging and
placing valves to shutdown positions. For the latter case, an emergency
shutdown controller activated by the BID and other diagnostic/checkout circuits
should be incorporated.

FIGURE 21 MICROPROCESSOR-BASED BID CIRCUIT CARD

FIGURE 22 BID CIRCUIT

Monitor B is a diagnostic program which requests the C/M I computer to start 
self-diagnosis by outputting a request signal (logic one) called Computer Test 
Request (CTR). The C/M I computer, upon receipt of the CTR signal, will start 
a self-diagnostic sequence to check out the functions of the CPU, memory and 
data bus. The SDP program inside the C/M I is driven by the C/M I Real Time 
Executive (RTE) program. It runs through a computer checkout routine and, at 
the end of the checkout, outputs a 7-bit result to the BID. The SDP consists 
of four steps: 

a. Buffer and bus line checkout. Different bit patterns are used to 
check whether the address, data or control lines are stuck at a 
logic one or logic zero. 

b. CPU checkout. Starts with a specific data byte and runs through 
major instructions, such as load, add, shift, complement and store, 
to obtain a result data byte. This result byte will then be sent 
back to the BID for interpretation and diagnosis. Only the BID can 
make a judgment of whether the C/M I has successfully passed the 
self-diagnostic procedure. 

c. EPROM checkout. Uses the checksum method to check out the EPROM memory.
A checksum word (or byte) was previously generated and stored for a
block of EPROM at the time the program was created. To verify that
the CPU and EPROM are functioning properly, the checksum is regenerated
at run time and compared with the stored value. The newly generated
checksum should always be identical to the previously generated one;
otherwise, a computer or memory failure has occurred.

d. RAM checkout. Checks out the RAM with the following sequence: read and
save the original data, write a specific test pattern, read back and
check the written data and then restore the original data.
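The EPROM and RAM checkouts of steps (c) and (d) can be illustrated with a minimal sketch. The Python below is not the SDP code. The 8-bit additive checksum, the test patterns 0x55 and 0xAA and all function and class names are assumptions introduced for illustration, since the report does not specify them.

```python
def checksum(block: bytes) -> int:
    """Assumed checksum: sum of all bytes, truncated to 8 bits."""
    return sum(block) & 0xFF


def eprom_ok(block: bytes, stored: int) -> bool:
    """Regenerate the checksum at run time and compare with the stored value."""
    return checksum(block) == stored


def ram_ok(ram, patterns=(0x55, 0xAA)) -> bool:
    """Non-destructive RAM check, one location at a time."""
    for addr in range(len(ram)):
        saved = ram[addr]              # read and save the original data
        try:
            for pattern in patterns:
                ram[addr] = pattern    # write a specific test pattern
                if ram[addr] != pattern:
                    return False       # read-back does not match what was written
        finally:
            ram[addr] = saved          # restore the original data
    return True


class StuckBitRam(bytearray):
    """Simulated faulty RAM (illustration only): bit 0 always reads as one."""

    def __getitem__(self, index):
        return super().__getitem__(index) | 0x01
```

A healthy `bytearray` passes `ram_ok` and keeps its contents, while `StuckBitRam` fails on the 0xAA pattern because bit 0 cannot be read back as zero.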

When an error is detected in any of the four steps, a coded bit pattern will be sent
to the BID to indicate the cause of the failure. Otherwise, the result byte
of step (b) will be sent to the BID. A Computer Test Completed (CTC) signal 
is sent from the C/M I (logic one means test completed) to the BID at the same 
time. When the BID concludes that the C/M I has had a malfunction it will 
request an emergency shutdown. 
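
The BID's judgment on the returned result can be sketched as follows. This is an illustration under stated assumptions only: the expected result value 0x5A, the treatment of a missing CTC signal as a failure and the names used are hypothetical. The report states only that a 7-bit result and a CTC signal are returned and that the BID alone makes the decision.

```python
EXPECTED_CPU_RESULT = 0x5A  # hypothetical 7-bit result of a correct CPU checkout


def bid_verdict(ctc: bool, result: int, expected: int = EXPECTED_CPU_RESULT) -> str:
    """Return "ok" or "shutdown" for one self-diagnostic cycle."""
    if not ctc:
        # Assumption: a test that never signals completion counts as a failure.
        return "shutdown"
    if (result & 0x7F) != expected:
        # Only 7 bits are examined. Any other pattern is either a coded
        # failure cause or a wrong CPU checkout result.
        return "shutdown"
    return "ok"
```

Because every pattern other than the expected one leads to a shutdown request, the BID does not need to decode the failure cause before acting on it.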

Laboratory Demonstration Observation

The demonstrated communication protocol allows reliable data/information to be
transferred between computers. A secondary benefit of this demonstration is
the readiness of the bit-serial RS-232C communication design for central/subsystem
and subsystem/DARS interfaces. The previously designed byte-parallel
protocol is a simpler operation transferring parallel signals over a 48-conductor
ribbon cable which has a length limitation of 15 feet. The bit-serial
RS-232C protocol uses a cable of five conductors and has a longer allowable
length of 100 feet. The communication speed of the serial protocol, although
slower than that of the parallel protocol, is still well within the EC/LSS
application requirement. For example, in the EC/LSS DARS application the speed
requirement is one data set (64 points) per two minutes. The projected subsystem
to central communication speed will be 64 bytes per second (based on
the assumption that the central system will respond to a subsystem request 
within one second). The demonstrated 1,200 baud (bits/second) nominal speed 
and 9,600 baud maximum speed are more than twice and 18 times that required, 
respectively. 
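
The quoted margins can be checked with a short calculation. The sketch below assumes 8 bits per byte with no framing overhead, which is the assumption that reproduces the figures in the text. The function name is introduced here for illustration.

```python
def speed_margin(baud: int, required_bytes_per_s: float, bits_per_byte: int = 8) -> float:
    """Ratio of the available bit rate to the required bit rate."""
    return baud / (required_bytes_per_s * bits_per_byte)


nominal = speed_margin(1200, 64)   # 1,200 / 512 = 2.34375
maximum = speed_margin(9600, 64)   # 9,600 / 512 = 18.75
```

If each byte also carried one start bit and one stop bit (10 bits per byte, an assumption not stated in the report), the nominal margin would be 1,200 / 640 = 1.875 instead.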

The microprocessor-based BID circuit was demonstrated successfully with the 
Intel MDS In-Circuit Emulator (ICE). Some problems associated with erroneous 
data/control bus signals were observed when running with the actual 8085 CPU. 
The noise was attributed to crosstalk between the breadboard circuit wires.
Elimination of the difficulties is projected when the BID circuit is built
on a printed circuit (PC) card.


CONCLUSIONS 

The following conclusions are a direct result of this instrumentation development
program:

1. Fault tolerance can be achieved by transmission error checking with 
software recovery algorithms, redundant components with comparison 
logic and redundant components with BID/BIC circuits. 

2. Reliability of a system with voting and spares within the system is 
inherently higher than that of a group of four redundant systems
with external voting logic. 

3. A microprocessor-based BID circuit card having dimensions of 11.4 cm
x 11.2 cm (4.5 in x 4.4 in) is feasible. It can detect failures of
CPU, memory, data bus, A/D interface and software in addition to 
monitoring critical process parameters. 

4. The technology of vibration signature analysis for actuator fault 
detection and prediction is available but the difficulty of vibra- 
tion sensor installation is a road block. Audio signature analysis 
instead of vibration signature analysis is more suitable for EC/LSS 
applications. More work is needed to engineer the audio signature
analyzer for EC/LSS application. 

5. Most EC/LSS sensor calibration curves are straight lines
(e.g., temperature, pressure, speed and dew point) or second/third
order polynomial approximations (e.g., flow rate and combustible gas
concentration). Storage of spare sensor calibration curves is,
thus, not a difficult task. The retrieval of these calibration
curves is more difficult. The ultimate answer to calibration curve
retention and retrieval for spare sensors is to incorporate them at
the sensor heads.

6. Automatic in situ sensor calibration techniques are readily applicable
to current, voltage, speed and combustible gas concentration sensors.
Their application to temperature, pressure, flow rate, dew point and
relative humidity sensors is difficult without further work on in 
situ calibration condition generation. The alternatives are triple 
redundant sensors (high reliability) or spare sensors with built-in 
calibration curves (high maintainability). 

7. Recording and storing data for fault isolation requires a large size 
memory and is thus not feasible. This problem can be avoided by 
using the state transition technique. In this technique different 
states leading to a failure are defined and events that cause the 
system to go from one state to another are identified as attributes.
Only current state and attributes are stored. Thus, real-time 
diagnosis could be done with less memory requirement. The state 
transition technique also provides fault prediction capability in 
addition to fault isolation. 
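
The state transition technique of conclusion 7 can be sketched as a small table-driven tracker. The state names, event names and class below are hypothetical, chosen only to illustrate the idea of storing the current state and its attributes instead of a full data record. The report does not define a specific state set.

```python
# Hypothetical states and events; the report does not list them.
TRANSITIONS = {
    ("normal", "reading_drifts"): "degraded",
    ("degraded", "reading_recovers"): "normal",
    ("degraded", "limit_exceeded"): "failed",
}


class StateTracker:
    """Keeps only the current state and the attributes that led to it."""

    def __init__(self):
        self.state = "normal"
        self.attributes = []

    def observe(self, event: str) -> str:
        next_state = TRANSITIONS.get((self.state, event))
        if next_state is None:
            return self.state            # no transition: nothing is stored
        self.state = next_state
        if next_state == "normal":
            self.attributes.clear()      # back to normal: history not needed
        else:
            self.attributes.append(event)
        return self.state

    def failure_predicted(self) -> bool:
        """Fault prediction: a state known to lead to failure has been entered."""
        return self.state == "degraded"
```

After a failure, `attributes` holds the short chain of events that caused it, which serves fault isolation. Entering an intermediate state before the failure serves fault prediction.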

RECOMMENDATIONS 

Additional development efforts should be directed towards selection and incorporation
of fault diagnostic functions available to EC/LSS application. The
intercomputer communication with error checking capability developed under
this program should be used in program loading and future central/subsystem
instrumentation links. The BID circuit should be used in the future C/M I
100A for fail-safe operation and in the C/M I 200 for fail-safe and ultimately
fault tolerant operation. The design of the C/M I 200 should have the follow- 
ing sequence: 

1. Minimum Model 200 C/M I definition. Define the minimum C/M I needed
to perform the process function on a continuous basis. The definition
should include the desired availability expressed in terms of
MTBF and MTTR for the process and the major subassemblies. 

2. Capabilities versus penalties trade-off analysis. Any additional
instrumentation hardware shall be quantified in terms of equivalent
weight penalties. Only the added capabilities which can be justified
from a size viewpoint should be included. The design must
provide for fail-safe process operation. The process subsystem
should be designed for an availability level reflected in operating
for 90 days with three shutdowns, each repairable
within two hours. The C/M I for the process subsystem should be
designed to have no shutdowns for the 90 days of operation with
replacement and maintenance allowed during the three shutdowns
caused by process malfunctions. Flexibility of C/M I designed for
the C/M I 100 series should be eliminated for reduction of size.
Equivalent weight penalties associated with spacecraft power and
heat rejection will be assumed as follows: 0.27 kg/W (0.59 lb/W) for
DC power, 0.32 kg/W (0.71 lb/W) for AC power, 0.08 kg/W (0.184 lb/W)
for heat rejection to liquid coolant and 0.20 kg/W (0.437 lb/W) for
heat rejection to air.

3. Design of advanced C/M I Model 200. This effort shall establish
projected space flight mission criteria, with primary emphasis on
mission length and quantified reliability requirements. Based on 
these numbers, failure rates for each of the LRUs in the subsystem 
should be established. The number of spares or redundancies should 
be defined to meet the reliability goal for a given time period. 

The added components, such as sensors and circuits within the C/M I 
to allow fault diagnostic functions, should be defined in order to 
meet the availability goal of the total subsystem. This activity
will result in a C/M I package ready for approval by NASA. 

4. Instrumentation mock-up effort. The mock-up of the flight instru- 
mentation design should answer the following questions: 

a. Which part of the C/M I should be removed from the instrumenta- 
tion enclosure and packaged closer to or integrated with the 
mechanical electrochemical assemblies? 

b. Which, if any, capability shall be packaged as a separate, 
in-flight diagnostic unit? This is to be differentiated from 
the more conventional TSA or flight-related ground support 
accessories.

c. Will the power supply of the C/M I be maintainable at the 
subassembly or the component level? 

d. Can the power supply be fractionalized so that several power 
supplies exist for the subsystem? 

e. What steps are needed to build the custom large-scale integrated
circuits to allow incorporation of the above conceived capabili- 
ties into the future hardware at minimum penalty in reliability, 
size and estimated cost? 

f. What is the operator/system interface and subsystem/central
interface projected for the flight hardware? 

g. What aspects of the developer's knowledge are projected to be 
incorporated into the flight hardware? 

h. Will multiplexing of signal conditioning be advantageous and 
feasible? 

5. Further advanced development. In future development efforts, a 
program is needed to lay out the routes for obtaining custom large- 
scale integrated circuits for EC/LSS applications. Selecting, 
designing and obtaining custom large-scale integrated circuits comes 
later when the flight mission is ready. Other development efforts 
should include the design of an electronic circuit to convert the 
Electrochemical Depolarized CO2 Concentrator (EDC) generated power
into usable electrical energy. This is not the same as power sharing 
between the EDC and the Water Electrolysis Subsystem (WES) because 
the EDC power can be used by other electrical devices in the absence 
of the WES. Development of actuator failure prediction and detection
using audio pattern recognition is recommended. Other advanced
designs aiming at further reduction of the instrumentation size and
power consumption of the Model 200 C/M I are recommended. An example of
the advanced designs is multiplexing and transmitting sensor data 
from the process assembly to the C/M I using a multiplexed transmitter. 
Evaluation and selection of available ruggedized electronics and 
space-borne packaging techniques are also strongly recommended. 
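
The availability target and the equivalent weight penalties of recommendation 2 lend themselves to a short worked calculation. The sketch below uses the figures quoted in the text. The function names and the example circuit (1.0 kg, 10 W of DC power, 10 W rejected to air) are hypothetical and serve only to show how a trade-off entry might be computed.

```python
# Equivalent weight penalties quoted in recommendation 2, in kg per watt.
PENALTY_KG_PER_W = {
    "dc_power": 0.27,
    "ac_power": 0.32,
    "heat_to_liquid": 0.08,
    "heat_to_air": 0.20,
}


def availability(mission_hours: float, shutdowns: int, repair_hours: float) -> float:
    """Fraction of the mission during which the subsystem is operating."""
    downtime = shutdowns * repair_hours
    return (mission_hours - downtime) / mission_hours


def equivalent_weight(hardware_kg: float, watts_by_category: dict) -> float:
    """Hardware weight plus the power and heat rejection penalties."""
    penalty = sum(
        PENALTY_KG_PER_W[name] * watts for name, watts in watts_by_category.items()
    )
    return hardware_kg + penalty


# 90 days, three shutdowns, two hours per repair (worst case from the text).
target = availability(90 * 24, 3, 2)

# Hypothetical added circuit: 1.0 kg, 10 W of DC power, 10 W rejected to air.
example_kg = equivalent_weight(1.0, {"dc_power": 10.0, "heat_to_air": 10.0})
```

Under these assumptions the target availability is 2,154/2,160, or about 0.9972, and the hypothetical circuit carries an equivalent weight of about 5.7 kg.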